Abstract
These are the results and figures.
Give a basic summary of species richness stats. For example, the regions with the lowest and highest long-term averages in species richness are:
kable(comm_master[,mean(reg_rich), by='reg'][reg%in%c("gmex","shelf")])
| reg | V1 |
|---|---|
| gmex | 142.28522 |
| shelf | 46.29991 |
Can also measure long-term variability of the different regions. Here is the standard deviation for each region:
kable(comm_master[,stats::sd(reg_rich),by='reg'])
| reg | V1 |
|---|---|
| ai | 2.444883 |
| ebs | 5.566457 |
| gmex | 4.266274 |
| goa | 8.506057 |
| neus | 8.237090 |
| newf | 4.365340 |
| sa | 4.035492 |
| shelf | 3.246860 |
| wctri | 7.153178 |
And here is the cross-region average of those long-term standard deviations:
kable(comm_master[,stats::sd(reg_rich),by='reg'][,mean(V1)])
| 5.313515 |
richness_ts()
Figure 1. Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.
naive_msom_scatter()
Figure S1. MSOM richness vs naive richness
MSOM richness and Naive richness are pretty similar. MSOM richness is probably more accurate, or is at least more conservative b/c it has fewer significant trends. But their similarity should help justify using observed presences/ absences in other analyses.
Manuscript paragraph:
MSOM estimates of richness were greater than Naive estimates, but the two methods produced similar temporal dynamics in richness (Figure S1). Henceforth, we report MSOM richness estimates. The greatest long-term average richness was the Gulf of Mexico (142.3), lowest in the Scotian Shelf (46.3). The inter-region average of long-term standard deviations of richness was 5.3; the Aleutian Islands showed the lowest variability (sd = 2.4), and the Gulf of Alaska was the most variable (sd = 8.5).
Examine trends for naive richness:
load("../../pkgBuild/results/rich_naive_trend_kendall.RData")
rich_naive_trend_kendall[reg!="wcann",BH:=p.adjust(taup, method='BH')]
rich_naive_trend_kendall <- rich_naive_trend_kendall[reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=taup)]
kable(rich_naive_trend_kendall)
| reg | estimate | BH | p.value |
|---|---|---|---|
| ai | -0.0606061 | 0.8370115 | 0.8370115 |
| ebs | 0.5010753 | 0.0001810 | 0.0000805 |
| gmex | 0.8529413 | 0.0000097 | 0.0000021 |
| goa | 0.2051282 | 0.4051368 | 0.3601216 |
| neus | 0.3368132 | 0.0103196 | 0.0080264 |
| newf | 0.7865834 | 0.0001112 | 0.0000371 |
| sa | -0.5685352 | 0.0002831 | 0.0001573 |
| shelf | 0.7048780 | 0.0000000 | 0.0000000 |
| wctri | 0.7640932 | 0.0045626 | 0.0030417 |
In the Naive estimates, 7 regions had significant \(\tau_b\).
Examine trends for MSOM richness:
load("../../pkgBuild/results/rich_trend_kendall.RData")
rich_trend_kendall[reg!="wcann",BH:=p.adjust(pvalue, method="BH")]
rich_trend_kendall <- rich_trend_kendall[reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=pvalue)]
kable(rich_trend_kendall)
| reg | estimate | BH | p.value |
|---|---|---|---|
| ebs | 0.4206052 | 0.0028542 | 0.0009514 |
| ai | 0.2317340 | 0.3652207 | 0.3246406 |
| goa | 0.2419794 | 0.3649750 | 0.2838694 |
| wctri | 0.6094305 | 0.0417890 | 0.0185729 |
| gmex | 0.0976038 | 0.5996890 | 0.5996890 |
| sa | -0.2179532 | 0.2121777 | 0.1414518 |
| neus | 0.1912424 | 0.2121777 | 0.1325807 |
| shelf | 0.4524160 | 0.0004334 | 0.0000482 |
| newf | 0.7267104 | 0.0006228 | 0.0001384 |
In the MSOM estimates, 4 regions had significant \(\tau_{b}\).
Overall, ~half the regions show positive trends. No regions show significant negative trends (although SEUS is close, depending on the analysis). It’s a bit surprising how much removing the autocorrelation seemed to impact some of the trends (AI in particular, I think).
Manuscript paragraph:
Estimated slopes for long-term trends (Kendall’s τb) in richness were positive for most regions. For Naïve estimates, all the seven positive τ were also significantly different from 0, whereas the two negative τb, Aleutian Islands and Southeast US, were not (Table S2). Long-term trends in MSOM estimates of richness had the same sign as Naïve trends, except for Aleutian Islands, but now the only significant τb were the following four regions: Eastern Bering Sea (τb = 0.41), West Coast US (τb = 0.61), Scotian Shelf (τb = 0.45), and Newfoundland (τb = 0.72) (Table 1). Richness trends were not significant in the Gulf of Mexico, Gulf of Alaska, and Northeast US, Aleutian Islands, Southeast US.
Looking for spatial footprint that will predict richness. Here I characterize the geographic distribution of a species in two ways: its density and its size. This is tricky for richness because richness is a community level trait, but I’ve characterized these two aspects of geographic distribution at the species level. In fact, these “species level” attributes are also potentially dyanmic – a species need not have fixed range density or size. So there are two levels of averaging – for each species take its long-term average value, then for each community take the average across species.
First I’ll show a figure relating range size and range density. Then I’ll present the figure of how size and density can predict richness. Finally, I’ll explore models more formally expressing the relationship between richness and size/ density.
rangeSizeDens()
Figure S2. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species’ long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.
These plots show that there is a strong relationship between range size and range density. Interestingly, in Figure S2B, the cross-region relationship is negative (if each color had 1 pt), whereas the within-region relationship is positive.
The positive relationship between size and density is not surprising. My interpretation of density is ~population size. Often population size is correlated with range size. I think this is a standard result, but I need to double-check.
rich_geoRange()
Figure 2. Species richness vs A) geographic range size, and B) geographic range density. Both metrics are based on each species’ long-term average of a statistic; range size is the proportion of sites occupied, range density is the proportion of tows in occupied sites. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.
Both range size and range density are pretty good predictors of species richness. I think I had originally missed the range size relationship b/c I hadn’t done the same aggregating procedure. The interpretation I have is that richness is highest when you have a bunch of rare species.
The goal here is to see if species richness is predicted by the typical range density of community’s constituent species. First I’ll run different types of models just to explore whether this is true, in general (across regions). Then I’ll drill in to each region individually to answer the same question.
# This is a function that'll help summarize model fit and coefficients and parameter significance:
mod_smry <- function(m, pred_name=c("density","size")){
pred_name <- match.arg(pred_name)
sc <- sem.coefs(m)
mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
mod_call <- as.character(mod_call)[2]
out <- cbind(
mod_call = mod_call,
sc[sc[,"predictor"]==pred_name,],
sem.model.fits(m)
)
out[,c("Class","N","mod_call","predictor","estimate","std.error","p.value","Marginal","Conditional")]
}
# Make a data set that is useful for these regressions (short variable names, etc):
range_reg <- comm_master[,list(reg, year, rich=reg_rich, density=propTow_occ_avg, size=propStrata_avg_ltAvg)]
# Fit different models to the whole data set
rich_dens_mods <- list()
rich_dens_mods[[1]] <- lm(rich ~ 1 + density, data=range_reg)
rich_dens_mods[[2]] <- lm(rich ~ 1 + density*reg, data=range_reg)
rich_dens_mods[[3]] <- lmer(rich ~ 1 + density + (1|reg), data=range_reg)
rich_dens_mods[[4]] <- lmer(rich ~ 1 + density + (1+density|reg), data=range_reg)
rich_dens_smry <- rbindlist(lapply(rich_dens_mods, mod_smry, pred_name="density"))
# Fit same model to each region separately
rich_dens_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
rich_dens_reg_mods[[r]] <- lm(rich ~ 1 + density, data=range_reg[reg==ur[r]])
}
rich_dens_reg_smry <- data.table(reg=ur, rbindlist(lapply(rich_dens_reg_mods, mod_smry, pred_name="density")))
# unlist(lapply(rich_dens_mods, stargazer, type='html'))
stargazer(rich_dens_mods[[1]], rich_dens_mods[[2]], rich_dens_mods[[3]], rich_dens_mods[[4]], type='html')
| Dependent variable: | ||||
| rich | ||||
| OLS | linear | |||
| mixed-effects | ||||
| (1) | (2) | (3) | (4) | |
| density | -94.071*** | -341.590** | -525.158*** | -664.300*** |
| (17.038) | (140.711) | (36.360) | (96.041) | |
| regebs | 753.130*** | |||
| (132.497) | ||||
| reggmex | 284.278** | |||
| (116.047) | ||||
| reggoa | 192.097** | |||
| (92.351) | ||||
| regneus | 216.220** | |||
| (88.588) | ||||
| regnewf | 365.078** | |||
| (150.048) | ||||
| regsa | 31.548 | |||
| (100.847) | ||||
| regshelf | -30.070 | |||
| (86.972) | ||||
| regwctri | 229.957** | |||
| (105.721) | ||||
| density:regebs | -802.601*** | |||
| (193.269) | ||||
| density:reggmex | -728.052*** | |||
| (264.670) | ||||
| density:reggoa | -535.503*** | |||
| (177.692) | ||||
| density:regneus | -352.184** | |||
| (156.568) | ||||
| density:regnewf | -496.211** | |||
| (237.666) | ||||
| density:regsa | -21.417 | |||
| (184.284) | ||||
| density:regshelf | 48.959 | |||
| (149.262) | ||||
| density:regwctri | -444.658** | |||
| (193.577) | ||||
| Constant | 143.267*** | 251.093*** | 375.585*** | 455.269*** |
| (9.843) | (81.730) | (28.308) | (73.939) | |
| Observations | 197 | 197 | 197 | 197 |
| R2 | 0.135 | 0.990 | ||
| Adjusted R2 | 0.131 | 0.990 | ||
| Log Likelihood | -571.773 | -554.875 | ||
| Akaike Inf. Crit. | 1,151.546 | 1,121.749 | ||
| Bayesian Inf. Crit. | 1,164.679 | 1,141.448 | ||
| Residual Std. Error | 29.771 (df = 195) | 3.259 (df = 179) | ||
| F Statistic | 30.485*** (df = 1; 195) | 1,096.211*** (df = 17; 179) | ||
| Note: | p<0.1; p<0.05; p<0.01 | |||
kable(rich_dens_smry) # different kinds of models
| Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|
| lm | 197 | rich ~ 1 + density | density | -94.0708 | 17.03765 | 0.0000 | 0.1351986 | NA |
| lm | 197 | rich ~ 1 + density * reg | density | -341.5902 | 140.71094 | 0.0162 | 0.9904861 | NA |
| lmerMod | 197 | rich ~ 1 + density + (1 | reg) | density | -525.1580 | 36.35966 | 0.0000 | 0.5358370 | 0.9981927 |
| lmerMod | 197 | rich ~ 1 + density + (1 + density | reg) | density | -664.3002 | 96.04126 | 0.0000 | 0.3493638 | 0.9994551 |
kable(rich_dens_reg_smry) # same model applied to each reg sep
| reg | Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|---|
| ai | lm | 12 | rich ~ 1 + density | density | -341.5902 | 24.24775 | 0.0000 | 0.9520286 | NA |
| ebs | lm | 31 | rich ~ 1 + density | density | -1144.1910 | 88.47021 | 0.0000 | 0.8522400 | NA |
| gmex | lm | 17 | rich ~ 1 + density | density | -1069.6426 | 124.78482 | 0.0000 | 0.8304652 | NA |
| goa | lm | 13 | rich ~ 1 + density | density | -877.0932 | 132.51898 | 0.0000 | 0.7992927 | NA |
| neus | lm | 32 | rich ~ 1 + density | density | -693.7744 | 122.76529 | 0.0000 | 0.5156318 | NA |
| newf | lm | 16 | rich ~ 1 + density | density | -837.8009 | 142.75834 | 0.0000 | 0.7109900 | NA |
| sa | lm | 25 | rich ~ 1 + density | density | -363.0073 | 130.10022 | 0.0104 | 0.2528899 | NA |
| shelf | lm | 41 | rich ~ 1 + density | density | -292.6311 | 18.11578 | 0.0000 | 0.8699704 | NA |
| wctri | lm | 10 | rich ~ 1 + density | density | -786.2482 | 136.01170 | 0.0004 | 0.8068424 | NA |
kable(as.data.frame(as.list( # avg of above
sapply(rich_dens_reg_smry, function(x)suppressWarnings(mean(x)))
)))
| reg | Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|---|
| NA | NA | 21.88889 | NA | NA | -711.7754 | 102.197 | 0.0012 | 0.7322612 | NA |
All models are pretty good predictors. Well, the most basic model kinda sucks I guess. It needs to account for some of the between-region variation.
Same as above, but now let’s look at range size as a predictor of species richness.
“Is range size a good predictor of species richness?”
rich_size_mods <- list()
rich_size_mods[[1]] <- lm(rich ~ 1 + size, data=range_reg)
rich_size_mods[[2]] <- lm(rich ~ 1 + size*reg, data=range_reg)
rich_size_mods[[3]] <- lmer(rich ~ 1 + size + (1|reg), data=range_reg)
rich_size_mods[[4]] <- lmer(rich ~ 1 + size + (1+size|reg), data=range_reg)
rich_size_smry <- rbindlist(lapply(rich_size_mods, mod_smry, pred_name="size"))
rich_size_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
rich_size_reg_mods[[r]] <- lm(rich ~ 1 + size, data=range_reg[reg==ur[r]])
}
rich_size_reg_smry <- data.table(reg=ur, rbindlist(lapply(rich_size_reg_mods, mod_smry, pred_name="size")))
# unlist(lapply(rich_size_mods, stargazer, type='html'))
stargazer(rich_size_mods[[1]], rich_size_mods[[2]], rich_size_mods[[3]], rich_size_mods[[4]], type='html')
| Dependent variable: | ||||
| rich | ||||
| OLS | linear | |||
| mixed-effects | ||||
| (1) | (2) | (3) | (4) | |
| size | -41.744 | -298.714*** | -320.492*** | -446.741*** |
| (27.019) | (108.160) | (20.673) | (71.160) | |
| regebs | 89.396** | |||
| (38.336) | ||||
| reggmex | 166.444*** | |||
| (46.423) | ||||
| reggoa | 73.846* | |||
| (38.392) | ||||
| regneus | 111.614*** | |||
| (37.349) | ||||
| regnewf | 30.344 | |||
| (40.702) | ||||
| regsa | 58.769 | |||
| (42.746) | ||||
| regshelf | -61.244* | |||
| (36.886) | ||||
| regwctri | 61.122 | |||
| (39.410) | ||||
| size:regebs | -297.732** | |||
| (119.580) | ||||
| size:reggmex | -197.781 | |||
| (134.793) | ||||
| size:reggoa | -185.302 | |||
| (116.924) | ||||
| size:regneus | -543.256*** | |||
| (118.393) | ||||
| size:regnewf | -256.003* | |||
| (137.719) | ||||
| size:regsa | 49.873 | |||
| (119.451) | ||||
| size:regshelf | 144.584 | |||
| (109.847) | ||||
| size:regwctri | -85.911 | |||
| (116.901) | ||||
| Constant | 102.122*** | 153.321*** | 185.206*** | 211.391*** |
| (8.045) | (36.441) | (15.181) | (21.935) | |
| Observations | 197 | 197 | 197 | 197 |
| R2 | 0.012 | 0.995 | ||
| Adjusted R2 | 0.007 | 0.994 | ||
| Log Likelihood | -562.694 | -496.586 | ||
| Akaike Inf. Crit. | 1,133.389 | 1,005.172 | ||
| Bayesian Inf. Crit. | 1,146.522 | 1,024.871 | ||
| Residual Std. Error | 31.819 (df = 195) | 2.421 (df = 179) | ||
| F Statistic | 2.387 (df = 1; 195) | 1,994.843*** (df = 17; 179) | ||
| Note: | p<0.1; p<0.05; p<0.01 | |||
kable(rich_size_smry)
| Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|
| lm | 197 | rich ~ 1 + size | size | -41.74371 | 27.01868 | 0.1240 | 0.0120930 | NA |
| lm | 197 | rich ~ 1 + size * reg | size | -298.71408 | 108.16038 | 0.0063 | 0.9947494 | NA |
| lmerMod | 197 | rich ~ 1 + size + (1 | reg) | size | -320.49227 | 20.67327 | 0.0000 | 0.2932665 | 0.9945506 |
| lmerMod | 197 | rich ~ 1 + size + (1 + size | reg) | size | -446.74051 | 71.15995 | 0.0000 | 0.3319367 | 0.9986230 |
kable(rich_size_reg_smry)
| reg | Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|---|
| ai | lm | 12 | rich ~ 1 + size | size | -298.7141 | 64.79745 | 0.0010 | 0.6800184 | NA |
| ebs | lm | 31 | rich ~ 1 + size | size | -596.4463 | 44.20232 | 0.0000 | 0.8626087 | NA |
| gmex | lm | 17 | rich ~ 1 + size | size | -496.4955 | 70.67676 | 0.0000 | 0.7668957 | NA |
| goa | lm | 13 | rich ~ 1 + size | size | -484.0165 | 72.54219 | 0.0000 | 0.8018673 | NA |
| neus | lm | 32 | rich ~ 1 + size | size | -841.9705 | 64.00081 | 0.0000 | 0.8522680 | NA |
| newf | lm | 16 | rich ~ 1 + size | size | -554.7167 | 57.74506 | 0.0000 | 0.8682739 | NA |
| sa | lm | 25 | rich ~ 1 + size | size | -248.8411 | 68.97584 | 0.0015 | 0.3613805 | NA |
| shelf | lm | 41 | rich ~ 1 + size | size | -154.1298 | 8.31788 | 0.0000 | 0.8980015 | NA |
| wctri | lm | 10 | rich ~ 1 + size | size | -384.6251 | 28.71723 | 0.0000 | 0.9573075 | NA |
kable(as.data.frame(as.list(sapply(rich_size_reg_smry, function(x)suppressWarnings(mean(x))))))
| reg | Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|---|
| NA | NA | 21.88889 | NA | NA | -451.1062 | 53.33061 | 0.0002778 | 0.7831802 | NA |
As far as picking one or the other, it doesn’t end up mattering much. Range size is a lot better than density in NEUS, and density outperforms size in AI. Otherwise, size as a slight edge over density on average, although both predictors are significant in all regions.
rich_ds_smry <- rbind(rich_dens_reg_smry, rich_size_reg_smry)
setkey(rich_ds_smry, reg, Class, predictor)
kable(rich_ds_smry)
| reg | Class | N | mod_call | predictor | estimate | std.error | p.value | Marginal | Conditional |
|---|---|---|---|---|---|---|---|---|---|
| ai | lm | 12 | rich ~ 1 + density | density | -341.5902 | 24.24775 | 0.0000 | 0.9520286 | NA |
| ai | lm | 12 | rich ~ 1 + size | size | -298.7141 | 64.79745 | 0.0010 | 0.6800184 | NA |
| ebs | lm | 31 | rich ~ 1 + density | density | -1144.1910 | 88.47021 | 0.0000 | 0.8522400 | NA |
| ebs | lm | 31 | rich ~ 1 + size | size | -596.4463 | 44.20232 | 0.0000 | 0.8626087 | NA |
| gmex | lm | 17 | rich ~ 1 + density | density | -1069.6426 | 124.78482 | 0.0000 | 0.8304652 | NA |
| gmex | lm | 17 | rich ~ 1 + size | size | -496.4955 | 70.67676 | 0.0000 | 0.7668957 | NA |
| goa | lm | 13 | rich ~ 1 + density | density | -877.0932 | 132.51898 | 0.0000 | 0.7992927 | NA |
| goa | lm | 13 | rich ~ 1 + size | size | -484.0165 | 72.54219 | 0.0000 | 0.8018673 | NA |
| neus | lm | 32 | rich ~ 1 + density | density | -693.7744 | 122.76529 | 0.0000 | 0.5156318 | NA |
| neus | lm | 32 | rich ~ 1 + size | size | -841.9705 | 64.00081 | 0.0000 | 0.8522680 | NA |
| newf | lm | 16 | rich ~ 1 + density | density | -837.8009 | 142.75834 | 0.0000 | 0.7109900 | NA |
| newf | lm | 16 | rich ~ 1 + size | size | -554.7167 | 57.74506 | 0.0000 | 0.8682739 | NA |
| sa | lm | 25 | rich ~ 1 + density | density | -363.0073 | 130.10022 | 0.0104 | 0.2528899 | NA |
| sa | lm | 25 | rich ~ 1 + size | size | -248.8411 | 68.97584 | 0.0015 | 0.3613805 | NA |
| shelf | lm | 41 | rich ~ 1 + density | density | -292.6311 | 18.11578 | 0.0000 | 0.8699704 | NA |
| shelf | lm | 41 | rich ~ 1 + size | size | -154.1298 | 8.31788 | 0.0000 | 0.8980015 | NA |
| wctri | lm | 10 | rich ~ 1 + density | density | -786.2482 | 136.01170 | 0.0004 | 0.8068424 | NA |
| wctri | lm | 10 | rich ~ 1 + size | size | -384.6251 | 28.71723 | 0.0000 | 0.9573075 | NA |
fight <- rich_ds_smry[,list(marginal_density_minus_size=.SD[predictor=="density", Marginal] - .SD[predictor=="size", Marginal]),by='reg']
kable(data.table(fight, mu=fight[,mean(marginal_density_minus_size)]))
| reg | marginal_density_minus_size | mu |
|---|---|---|
| ai | 0.2720102 | -0.0509189 |
| ebs | -0.0103687 | -0.0509189 |
| gmex | 0.0635695 | -0.0509189 |
| goa | -0.0025746 | -0.0509189 |
| neus | -0.3366362 | -0.0509189 |
| newf | -0.1572839 | -0.0509189 |
| sa | -0.1084906 | -0.0509189 |
| shelf | -0.0280311 | -0.0509189 |
| wctri | -0.1504652 | -0.0509189 |
The two metrics of geographic range are well correlated. Furthermore, richness can be predicted pretty well using regressions with either as a predictor. There are large differences among regions, though. This is probably because richness is not readily comparable among most regions. Regions vary mostly in their intercept values, and they have fairly similar slopes (though they are not identical, and model fits improve when allowing slopes to vary among regions; it’s just that the improvement is small compared to allowing intercepts to vary among regions).
The interpretation of the result that geographic distribution predicts species richness is likely associated with species rarity. When the average range density or range size of a community is low, it means it has a lot of species that are rare (at either spatial scale). It’s these rare species that come and go, and form the dynamics of richness that we observe. When that dynamical value is high, it implies that an above-average number of the dynamic species are present. Because those species are transient (dynamic), they are also probably rare.
categ_barplot()
# And here's the table behind the barplot:
kable(t(
spp_master[!duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
)[c(4,1,2,3),])
| ai | ebs | gmex | goa | neus | newf | sa | shelf | wctri | |
|---|---|---|---|---|---|---|---|---|---|
| neither | 39 | 64 | 105 | 63 | 56 | 49 | 69 | 27 | 64 |
| both | 6 | 44 | 31 | 17 | 83 | 12 | 35 | 20 | 15 |
| colonizer | 9 | 2 | 8 | 18 | 1 | 10 | 0 | 1 | 12 |
| leaver | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
# Or, to see the most common categories overall:
kable(data.frame(as.list(
rowSums(spp_master[!duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)][,c(4,1,2,3)])
)))
| ai | ebs | gmex | goa | neus | newf | sa | shelf | wctri |
|---|---|---|---|---|---|---|---|---|
| 55 | 110 | 144 | 98 | 141 | 72 | 104 | 48 | 92 |
# Does that pattern change if we just look at regions with positive slopes?
pos_reg <- c("ebs","newf","shelf","wctri")
kable(data.frame(as.list(
rowSums(t(
spp_master[reg%in%pos_reg & !duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
)[c(4,1,2,3),])
)))
| neither | both | colonizer | leaver |
|---|---|---|---|
| 204 | 91 | 25 | 2 |
kable(data.frame(as.list(
rowSums(t(
spp_master[!reg%in%pos_reg & !duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
)[c(4,1,2,3),])
)))
| neither | both | colonizer | leaver |
|---|---|---|---|
| 332 | 172 | 36 | 2 |
Figure S3. Number of species beloning to the categories of both, neither, colonizer, leaver in each region
It’s the same pattern, whichever way you split it. However, AI is the only region that had more colonizers than both species. An interesting way to think about some of this is that the average sd in richness was 5.3135146, so when the number of colonizer or leaver species exceed’s that region’s sd, the impact of those categories, which I consider to be dubious, might start being relevant (though it’s not necessarily problematic, nor is this even close to an actual test for the significance of those categories to the trend). EBS and Shelf had significant positive trends in richness and very low numbers in the colonizer category. WCTRI and NEWF had similar numbers in the both and colonizer category.
col_ext_ts()
Figure S4. Number of colonizations (blue) and extinctions (red) over time in each region.
In most regions the differences in colonization and extinction numbers are similar over time. The most obvious exceptions are for the 3 regions that showed large initial spikes in richness; the GOA, GMEX, and AI regions initially have much larger numbers of colonizers than leavers, but this number shrinks rapidly until the two rates are ~equal.
For the regions with significant positive slopes, there is no visually obvious increase in colonizations relative to extinctions over time. Because the colonization and extinction numbers tend to track each other over the long-term, it it would be difficult to attribute the long-term changes in richness to a change in just colonization or extinction rates.
Manuscript paragraph:
A time series of richness can be decomposed into the colonizations and extinctions of individual species over time. We categorized species according to the following colonization extinction patterns: present in all years = neither (536 species), colonized and went extinct = both (263 species), initially absent but present every year after its colonization = colonizer (61 species), initially present but absent every year after its extinction = leaver (4 species). Most regions had the same overall ranking (neither > both > colonizer > leaver), except in the Northeast US where both was the most common and neither second, and in the Aleutian Islands where colonizer was the second most common and both third (Figure S4). In general, changes in richness were not due to permanent departures or introductions of species to the region. Furthermore, colonization and extinction rates did not become more dissimilar over time for any region (Figure S5). Colonizations were initially greater than extinctions in Aleutian Islands, Gulf of Alaska, and Gulf of Mexico, but this difference disappeared in the latter portion of these time series, as evidenced by these regions’ initially rapid increase in richness that later plateaued. The other regions did not show strong trends in the difference between colonizations and extinctions over time, making it difficult to attribute the long-term trends in richness to changes in just colonization or just extinction rates.
The result that richness is predicted by geographic range implied an underlying association between range, colonization/ extinction, and richness itself. This paper begins by explaining richness with range. It will end by explaining how range changes near a colonization/ extinction event. Here, between the two, I’ll show how the number of colonizations is related to range.
####Figure S5
ceEventRange()
Figure S5. Number of colonizations and extinctions as a function of range size and range density.
Yup, this is definitely a thing. Long-term average range size and range density predict how many colonizations and extinctions a species is likely to have. This will lead nicely into examing how range changes prior to an extinction or after a colonization.
ceRate_map(ce="colonization")
Figure 2. Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.
ceRate_map(ce="extinction")
Figure S7. Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.
Hotspots can be seen in most regions. Newfoundland also has high values around its edge (as opposed to interior), it seems. NEUS and Gmex show very strong hotspots, and other locations tend to be much much lower. Other regions show more of a continuum.
rel_col_ext_rate <- mapDat[,j={
map_smooth_col <- spatstat::Smooth(spatstat::ppp(x=.SD[,lon], y=.SD[,lat], marks=.SD[,n_spp_col_weighted], window=mapOwin[[reg]]), hmax=1)
mark_range_col <- range(map_smooth_col, na.rm=TRUE)*10
map_smooth_ext <- spatstat::Smooth(spatstat::ppp(x=.SD[,lon], y=.SD[,lat], marks=.SD[,n_spp_ext_weighted], window=mapOwin[[reg]]), hmax=1)
mark_range_ext <- range(map_smooth_ext, na.rm=TRUE)*10
ol <- list(
minval_col=mark_range_col[1], maxval_col=mark_range_col[2], max_o_min_col=do.call("/",as.list(rev(mark_range_col))),
minval_ext=mark_range_ext[1], maxval_ext=mark_range_ext[2], max_o_min_ext=do.call("/",as.list(rev(mark_range_ext)))
)
lapply(ol, function(x)if(is.numeric(x)){signif(x,3)}else{x})
},by=c("reg")]
kable(rel_col_ext_rate, caption="Table. The colonization and extinction intensity range and max/min ratio")
| reg | minval_col | maxval_col | max_o_min_col | minval_ext | maxval_ext | max_o_min_ext |
|---|---|---|---|---|---|---|
| ebs | 1.16e-01 | 0.554 | 4.79 | 1.20e-01 | 0.546 | 4.56e+00 |
| ai | 5.66e-02 | 0.483 | 8.53 | 0.00e+00 | 0.174 | 1.37e+09 |
| goa | 1.17e-03 | 0.764 | 652.00 | 1.75e-02 | 0.567 | 3.24e+01 |
| wctri | 2.32e-02 | 1.160 | 50.10 | 9.21e-03 | 0.951 | 1.03e+02 |
| gmex | 5.08e-01 | 3.180 | 6.25 | 7.27e-02 | 3.280 | 4.51e+01 |
| sa | 1.08e+00 | 2.180 | 2.03 | 7.58e-01 | 3.690 | 4.87e+00 |
| neus | 1.06e-01 | 22.500 | 213.00 | 8.73e-02 | 21.100 | 2.42e+02 |
| shelf | 4.81e-05 | 1.220 | 25400.00 | 5.74e-05 | 1.050 | 1.82e+04 |
| newf | 3.55e-02 | 0.582 | 16.40 | 1.07e-02 | 0.612 | 5.71e+01 |
kable(rel_col_ext_rate[,lapply(.SD, median)], caption="Table. Median of above table")
| reg | minval_col | maxval_col | max_o_min_col | minval_ext | maxval_ext | max_o_min_ext |
|---|---|---|---|---|---|---|
| neus | 0.0566 | 1.16 | 16.4 | 0.0175 | 0.951 | 57.1 |
nb_moranI(ce="colonization")
Figure S8. Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.
nb_moranI(ce="extinction")
Figure S9. Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.
rangeSize_absenceTime()
Figure 4. Geographic range size (A,B) and geographic range density (C,D) vs years until extinction (A,C) and years after colonization (B,D). For each unique value on the horizontal axis, the cross-species average for the range metric is displayed, and a linear model fit through this average. Statistics in main text do not use this aggregation. Extinction events are identified as occuring the year before the species is absent (?right?), colonization the first year it is present after an absence.
Range size declines near an absence much more consistently than does range density; both are (relatively) low just before extinction and just after colonization. However, range density has much more variable intercepts among regions, whereas range size does not.
This makes sense, at least somewhat, because colonization and extinction events are defined at the site level; though the outcome isn’t necessitated by this formulation, because size could drop suddenly. In fact, when a species is absent, both its range size and its range density must be 0 (though, range density is technically calculated for only those sites that are occupied, so I supposed it’s technically undefined according to the equations I’m using).
I think the regressions for range size should omit an intercept, while the regressions for range density should have it. This might be hard to justify fully a priori (though see my thinking in previous paragraph), so I’ll probably just do a model selection and maybe discuess the difference if one model has an intercept and the other does not.
rangeTimeDT <- spp_master[!is.na(stretch_type) & propStrata!=0]
rangeTimeDT <- rangeTimeDT[,list(
reg=reg,
event=as.character(event_year),
spp=spp,
type=as.character(stretch_type),
time=ext_dist,
size=propStrata,
density=propTow_occ
)]
sizeColExt_mods <- list()
sizeColExt_mods[[1]] <- lmer(size ~ time*type + (time*type | spp/reg), data=rangeTimeDT)
sizeColExt_mods[[2]] <- lmer(size ~ time*type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
sizeColExt_mods[[3]] <- lmer(size ~ time + type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
sizeColExt_mods[[4]] <- lmer(size ~ time + (time | spp/reg), data=rangeTimeDT) # this is what I settled on previously i think
sizeColExt_mods[[5]] <- lmer(size ~ time + (1 | spp/reg), data=rangeTimeDT)
sizeColExt_mods[[6]] <- lmer(size ~ time + (time - 1 | spp/reg), data=rangeTimeDT)
# do.call(stargazer, c(sizeColExt_mods, list(type="html")))
densityColExt_mods <- list()
densityColExt_mods[[1]] <- lmer(density ~ time*type + (time*type | spp/reg), data=rangeTimeDT)
densityColExt_mods[[2]] <- lmer(density ~ time*type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
densityColExt_mods[[3]] <- lmer(density ~ time + type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
densityColExt_mods[[4]] <- lmer(density ~ time + (time | spp/reg), data=rangeTimeDT)
densityColExt_mods[[5]] <- lmer(density ~ time + (1 | spp/reg), data=rangeTimeDT)
densityColExt_mods[[6]] <- lmer(density ~ time + (time - 1 | spp/reg), data=rangeTimeDT)
do.call(stargazer, c(sizeColExt_mods, densityColExt_mods, list(type="html")))
| Dependent variable: | ||||||||||||
| size | density | |||||||||||
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | |
| time | 0.007*** | 0.007*** | 0.007*** | 0.007*** | 0.005*** | 0.007*** | 0.003*** | 0.003*** | 0.003*** | 0.003*** | 0.002*** | -0.001 |
| (0.001) | (0.001) | (0.001) | (0.001) | (0.0002) | (0.001) | (0.0004) | (0.0005) | (0.0004) | (0.0004) | (0.0004) | (0.002) | |
| typepre_ext | 0.002 | 0.001 | 0.002 | 0.008 | 0.012 | 0.012 | ||||||
| (0.003) | (0.005) | (0.005) | (0.008) | (0.009) | (0.008) | |||||||
| time:typepre_ext | -0.001 | 0.0004 | 0.001 | -0.00000 | ||||||||
| (0.001) | (0.001) | (0.001) | (0.001) | |||||||||
| Constant | 0.062*** | 0.071*** | 0.070*** | 0.062*** | 0.073*** | 0.055*** | 0.423*** | 0.425*** | 0.426*** | 0.426*** | 0.429*** | 0.444*** |
| (0.004) | (0.010) | (0.010) | (0.004) | (0.006) | (0.002) | (0.010) | (0.050) | (0.050) | (0.010) | (0.010) | (0.004) | |
| Observations | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 | 5,726 |
| Log Likelihood | 7,313.790 | 7,305.312 | 7,311.481 | 7,291.964 | 6,563.036 | 6,809.122 | 2,604.312 | 2,802.634 | 2,808.608 | 2,589.062 | 2,581.763 | 1,586.657 |
| Akaike Inf. Crit. | -14,577.580 | -14,582.620 | -14,596.960 | -14,565.930 | -13,116.070 | -13,608.240 | -5,158.624 | -5,577.268 | -5,591.217 | -5,160.123 | -5,153.525 | -3,163.315 |
| Bayesian Inf. Crit. | -14,411.260 | -14,489.490 | -14,510.480 | -14,506.050 | -13,082.810 | -13,574.980 | -4,992.304 | -5,484.129 | -5,504.731 | -5,100.249 | -5,120.261 | -3,130.051 |
| Note: | p<0.1; p<0.05; p<0.01 | |||||||||||